ABSTRACT
computer-assisted instruction, is easier and can be done by personnel who are
not  proficient  in  computer  programming  if  an  authoring  language  is  provided. 
The  Minnesota  Computerized  Adaptive  Testing  Language  (MCATL)  is  an 
authoring  language  specifically  designed  for  specifying  adaptive  tests.  Its 
fourteen  statements  can  be  grouped  into  five  functions:  test  division, 
administration  control,  scoring,  reporting,  and  customizing.  The  first  four 
categories  provide  statements  for  specifying  most  types  of  adaptive  tests  with 
minimal  programming  effort;  the  fifth  category  provides  an  interface  with 
standard  programming  languages  for  tests  that  cannot  be  directly  specified  in 
MCATL.  A  formal  specification  of  MCATL  in  Backus  Naur  Form  and  a 
practical  example  of  MCATL  are  provided. 
INTRODUCTION


Computerized  adaptive  testing  (CAT;  Weiss,  1978,  1980,  1985)  is  a  form  of 
psychological  testing  in  which  the  computer  chooses  the  items  that  are  most 
appropriate  for  each  examinee.  Like  more  conventional  paper-and-pencil  tests, 
computerized  tests  must  be  constructed  or  authored.  Paper-and-pencil  authoring 
requires  item  selection,  formatting,  and  typing.  An  author  may  choose  item 
cards  from  an  item  bank,  sort  them  into  the  desired  order,  and  give  them  to  a 
typist  to  produce  the  final  test.  Authoring  a  computerized  test  can  be 
considerably  more  complex.  In  the  early  days  of  computerized  testing,  tests 
were  authored  in  programming  languages  designed  for  general-purpose 
computation.  Items  were  coded  as  print  statements  in  programs.  Although  this 
process  resulted  in  satisfactory  tests,  it  required  long  programs  and  the  services 
of  a  computer  programmer. 

Computer-assisted  instruction  (CAI)  developed  simultaneously  with  CAT. 
Since  the  underlying  theory  and  processes  associated  with  CAI  were 
considerably  simpler  than  those  of  CAT,  CAI  became  commercially  viable  much 
earlier.  From  the  point  of  view  of  authoring  and  administration,  CAT  and  CAI 
have  some  similarities.  At  the  most  basic  level,  presentation  of  a  CAI 
instructional screen and acceptance of an examinee's response is no different
from presentation of a test item and acceptance of an examinee's response.

Like  CAT  tests,  CAI  instruction  must  be  authored  on  a  computer  before  it  can 
be  administered  to  examinees.  Because  the  use  of  CAI  was  so  widespread  and 
its  content  so  voluminous,  it  was  not  feasible  for  all  instruction  to  be  authored 
by  programmers.  Nor  was  it  desirable  from  the  instructional  author’s 
perspective  because  a  certain  amount  of  interactive  control  over  the  material  is 
lost  when  it  is  submitted  to  a  programmer.  To  make  it  possible  for  a  non- 
programmer  to  enter  instruction,  a  new  type  of  programming  language, 
typically  referred  to  as  an  authoring  language,  was  developed. 

In  essence,  all  authoring  languages  attempt  to  provide  the  instructional 
author  with  a  set  of  utilities  specifically  designed  for  creating  instruction.  A 
CAI  authoring  language  at  its  most  basic  level  has  to  provide  a  means  for 
presenting  instructional  material  on  a  computer  display,  accepting  a  response 
from  the  examinee,  assigning  his  or  her  response  to  one  of  a  finite  number  of 
categories,  and  mechanically  branching  as  a  function  of  the  response  given. 
These  functions  are  also  required  of  a  CAT  language. 

In  spite  of  the  similarities  between  CAI  and  CAT,  a  standard  CAI 
authoring  system  is  ill  suited  to  the  development  of  adaptive  tests  because  it 
has  many  features  that  are  inappropriate  for  CAT  and  it  lacks  several  features 
that  are  necessary  for  CAT.  An  authoring  system  developed  for  CAI  must 
provide  substantial  flexibility  in  the  item  formats  that  it  can  present  and 
considerable  latitude  in  the  types  of  responses  it  can  accept.  Motion  or 
animation  in  the  instructional  display  is  common  for  CAI.  Similarly,  a  free- 
response  format  is  common  for  responding  to  CAI.  Branching  and  instruction 
selection are very rudimentary, however; the branching algorithms for CAI are
rarely more complicated than a simple mechanical branch to a new
instructional item based on the response to the previous one. Furthermore,
instructional sequences are usually not scored, and if they are, scoring is very
rudimentary.

In  the  current  state  of  the  art,  adaptive  tests  rarely  require  complex  visual 
displays  and  are  typically  limited  to  a  multiple-choice  response  format.  This  is 
due  not  so  much  to  the  psychometrics  involved,  but  rather  to  the  long  tradition 
of  objective  testing  in  the  paper-and-pencil  medium,  in  which  limited  formats 
were  available.  It  will  probably  be  several  years  before  psychological  tests  are 
developed  to  effectively  exploit  the  computer’s  capabilities  for  more  complex 
item  types. 

Item selection and scoring for CAT are considerably more complex than for CAI.
Because each examinee receives a different set of test items, the conventional
proportion-correct  score  is  inadequate.  In  effect,  a  good  adaptive  test  attempts 
to  choose  items  for  a  particular  examinee  for  which  he  or  she  has 
approximately  a  50%  chance  of  knowing  the  correct  answer.  If  the  testing 
procedure  is  effective,  there  will  be  virtually  no  variance  in  proportion-correct 
scores  among  examinees.  Scoring  is  thus  typically  built  upon  mathematically 
complex  and  computationally  tedious  statistical  models.  Depending  on  the 
purpose  of  the  test,  item  selection  may  be  as  simple  as  the  mechanical 
branching  of  a  CAI  program,  but  it  usually  involves  selecting  items  to  optimize 
a  computationally  tedious  statistical  function. 

The MicroCAT™ Testing System is one of a small number of CAT systems
available  for  complete  general-purpose  test  development.  It  may  be  the  only 
CAT  system  available  that  contains  a  complete  test  authoring  system  for  off- 
the-shelf  microcomputer  hardware.  MCATL,  the  Minnesota  Computerized 
Adaptive  Testing  Language,  is  the  authoring  language  used  by  the  MicroCAT 
system  to  specify  adaptive  tests.  This  report  describes  MCATL  and  the  process 
by  which  it  was  developed. 


LANGUAGE  DESIGN 

The  design  of  any  programming  environment  involves  trade-offs  between 
the  investment  in  developing  the  system  itself,  the  ease  of  program  development 
within  the  system,  and  the  efficiency  with  which  programs  developed  in  the 
system  will  execute.  For  minimal  investment  in  system  development  and 
maximum  efficiency  of  execution,  machine  language  is  ideal.  For  ease  of  use, 
high-level  metaphorical  programming  environments  are  best.  An  example  of  a 
metaphorical  programming  environment  is  one  in  which  the  computer  screen  is 
made  to  resemble  a  desk  top.  Within  the  metaphor,  the  user  can  accomplish 
tasks  such  as  shuffling  papers  on  his  or  her  desk  with  relative  ease.  The 
problem  with  such  high-level  environments  is  that  they  are  usually  limited  to 
functions  within  their  metaphors  (in  this  case,  what  can  be  done  on  the  top  of 
a  desk),  they  are  very  expensive  to  develop,  and  they  require  considerable 
computer  power  to  execute.  Between  the  two  extremes  of  machine  language 
and  metaphorical  programming  environments  are  high-level  programming 
languages  and  various  kinds  of  authoring  languages  like  MCATL. 


The  primary  objective  in  the  design  of  MCATL  was  to  provide  a 
reasonably  friendly  environment  for  authoring  tests.  It  had  to  provide  a 
framework  of  functions  for  performing  authoring  tasks.  The  design  of  MCATL 
thus  began  with  an  analysis  of  the  authoring  process,  which,  in  a  computerized 
mode,  amounts  to  structuring  the  presentation  of  test  items  and  specifying  how 
responses  to  the  items  are  to  be  processed. 

Test  administration  can  be  characterized  by  three  functions:  presentation, 
scoring,  and  reporting.  Presentation  includes  item  selection  and  test 
termination.  The  criteria  by  which  items  should  be  selected  or  the  rules  for 
mechanically  branching  from  item  to  item  or  subtest  to  subtest  must  be 
determined.  Additionally,  a  test  termination  rule  (a  logical  decision  rule)  to  be 
evaluated  after  each  item  is  administered  must  be  chosen  to  determine  when 
testing  should  stop. 

Vale  (1981)  suggested  that  the  majority  of  CAT  strategies  could  be  grouped 
into  a  simple  structural  taxonomy  with  three  main  categories:  inter-item 
branching,  inter-subtest  branching,  and  model-based  branching.  In  a  test  based 
on  inter-item  branching,  items  are  selected  by  branching  from  the  last  item 
administered  to  one  of  two  new  items  based  solely  on  whether  the  response  to 
the  preceding  item  was  correct  or  incorrect.  In  a  test  based  on  inter-subtest 
branching,  the  branching  is  from  one  set  of  items  to  another  set  of  items  based 
on  a  score  from  the  previous  set  of  items.  Vale  further  subdivided  this 
category  into  reentrant  and  non-reentrant  branching.  In  a  non-reentrant 
branching strategy, an entire subtest is administered before branching occurs.
In a reentrant strategy, only one item within a subtest is administered when
that  subtest  is  branched  to.  Each  time  a  subtest  is  branched  to,  the  next 
sequential  unadministered  item  is  administered.  The  final  category,  model- 
based  branching,  selects  items  by  searching  the  entire  item  pool  to  determine 
which  item  optimizes  the  statistical  criterion  function. 

MCATL  evolved  from  TCL  (Test  Control  Language),  an  early  prototype 
language  for  specifying  adaptive  tests  (Vale,  1981).  MCATL  was  designed  to 
more  closely  follow  the  structure  of  Vale’s  taxonomy  and  to  eliminate  several 
shortcomings  of  TCL.  First,  because  TCL  did  not  have  a  subtest  structure,  it 
was  difficult  to  administer  several  independent  tests  within  a  testing  session.  A 
second  test  could  be  included  only  by  explicitly  resetting  all  of  the  pointers  and 
the  scores.  Second,  inter-subtest  branching  was  unstructured  and  could  be 
accomplished  only  with  conditional  branches  to  labels  in  the  test  specification. 
This  was  especially  clumsy  for  reentrant  inter-subtest  branching  in  which 
pointer variables had to be explicitly set and reset by the test author.
Although the process worked, the resulting specification was difficult to read
and  required  a  fair  amount  of  programming  skill.  Third,  there  was  no 
separation  of  scoring  functions  from  score  variables.  For  example,  a  test  could 
not  use  separate  Bayesian  priors  for  item  selection  and  scoring.  Fourth,  TCL 
had  very  limited  reporting  capabilities;  the  report  produced  was  a  fixed-format 
numeric  listing  of  scores.  Furthermore,  it  was  difficult,  if  not  impossible,  to 
get  scores  printed  after  each  item  was  administered  rather  than  at  the  end  of 
the  test.  Fifth,  TCL  had  no  text  capabilities  available  for  interpreting  the  test 
scores for the examinee. Finally, TCL offered no opportunity for
customization; strategies that could not be specified in TCL could not be
developed within the TCL system.


LANGUAGE  DESCRIPTION 

MCATL is a line-oriented language. All 80 characters of a line are
significant, and an attempt to read an 81st character is treated as an end of
line. If an ampersand appears in a line, the remaining characters on that line
are ignored, and the next line is scanned as a continuation of the current one.
An exclamation point has the same effect as an end of line: the remaining
characters on the line are ignored.
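
The line-scanning rules above can be sketched in Python. This is an illustration only, not the MicroCAT implementation: the function name logical_lines is hypothetical, and the sketch assumes that "&" and "!" act as markers wherever they occur, including inside quoted strings, a case the description above does not address.

```python
def logical_lines(physical_lines, width=80):
    """Join physical lines into logical lines using the rules described above."""
    logical = []
    pending = ""  # text carried over from lines that ended with "&"
    for raw in physical_lines:
        text = raw.rstrip("\n")[:width]  # nothing past column 80 is read
        amp = text.find("&")
        bang = text.find("!")
        # Whichever marker comes first on the line decides how the line ends.
        if amp != -1 and (bang == -1 or amp < bang):
            pending += text[:amp]  # continuation: keep scanning on the next line
            continue
        if bang != -1:
            text = text[:bang]  # "!" acts as an end of line
        logical.append(pending + text)
        pending = ""
    if pending:
        logical.append(pending)  # dangling continuation at end of input
    return logical
```

Applied to the two-line SETSCORE statement of the sample test, the sketch would return a single logical line containing both scoring functions.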

Figure  1  shows  the  formal  syntax  of  MCATL  in  Backus  Naur  Form  (BNF; 
cf. Jensen & Wirth, 1974). The functions of the statements and their uses are
described  in  detail  in  the  User’s  Manual  for  the  MicroCAT  Testing  System 
(Assessment  Systems  Corporation,  1984).  The  statements  are  briefly  described 
here.  The  14  MCATL  statements  can  be  grouped  into  five  functions:  test 
division,  administration  control,  scoring,  reporting,  and  customizing. 


Figure 1. EBNF Description of MCATL

<test>                  ::= "TEST" <name> <term>
                            <statements> "ENDTEST" <term>
<name>                  ::= <1-6 letters>
<term>                  ::= <end of line> | "!"
<statements>            ::= <statement> {<statements>}
<statement>             ::= [<label> " "]
                            [ <test> |
                              <item statement> |
                              <terminate statement> |
                              <if statement> |
                              <jump statement> |
                              <search statement> |
                              <sequence statement> |
                              <review statement> |
                              <set statement> |
                              <setscore statement> |
                              <keep statement> |
                              <autokeep statement> |
                              <interpret statement> |
                              <custom statement> |
                              <null> ] <term>
<label>                 ::= "$" <name>
<item statement>        ::= "#" <item number> [<branch clause>]
                            [<characteristic clause>]
<item number>           ::= <name> <item specifier>
<item specifier>        ::= <1-3 digits, not all zero>
<branch clause>         ::= <correct branch> <incorrect branch>
                            [<error branch>] |
                            <alternative branch> [<error branch>] |
                            <error branch> |
                            <unconditional branch> [<error branch>]
<correct branch>        ::= "CO:" <label> |
                            "CORRECT:" <label>
<incorrect branch>      ::= "IN:" <label> |
                            "INCORRECT:" <label>
<alternative branch>    ::= ("AL:" | "ALTERNATIVE:") "A=" <label>
                            [ "B=" <label> [ "C=" <label> [ "D=" <label>
                            [ "E=" <label> [ "F=" <label> ] ] ] ] ]
<error branch>          ::= "ER:" <label> |
                            "ERROR:" <label>
<unconditional branch>  ::= "PROCEED:" <label> |
                            "PR:" <label>
<characteristic clause> ::= <characteristic> [<characteristic clause>]
<characteristic>        ::= "SLIMIT:" <value> |
                            "RLIMIT:" <value> |
                            ("CLEAR" | "NOCLEAR") |
                            "KEY:" <value>
<terminate statement>   ::= "TERMINATE" [<label>] <logical expression>
<logical expression>    ::= ["NOT"] ["("] <relationship> <logical operator>
                            (<relationship> | <logical expression>) [")"]
<relationship>          ::= <arithmetic expression> <relational operator>
                            <arithmetic expression>
<logical operator>      ::= "AND" | "OR"
<relational operator>   ::= "<" | ">" | "=" | "<=" | ">=" | "<>"
<if statement>          ::= "IF" <logical expression> <term> <statements>
                            {"ELSEIF" [<logical expression>] <term>
                            <statements>} "ENDIF"
<jump statement>        ::= "JUMP" <label> [<logical expression>] <term>
<search statement>      ::= "SEARCH" <value> <term>
                            <item statement> {<item statement>}
                            "ENDSEARCH"
<value>                 ::= <var> | <con>
<var>                   ::= ["@"] <letter> <0-9 letters or digits>
<con>                   ::= <integer or real number>
<sequence statement>    ::= "SEQUENCE" [<branch clause>] [ "ONEND:" <label> ]
                            [<characteristic clause>] <term> <item statement>
                            <term> {<item statement> <term>}
                            "ENDSEQUENCE"
<review statement>      ::= "REVIEW" {<review option>}
<review option>         ::= "ALL" | "SKIPPED" | "SPECIFIC"
<doreview statement>    ::= "DOREVIEW"
<set statement>         ::= "SET" <var> "=" <arithmetic expression>
<expression>            ::= "(" <logical expression> |
                            <arithmetic expression> ")"
<arithmetic expression> ::= ["("] {"-"} <value> [<arithmetic operator>
                            (<value> | <arithmetic expression>)] [")"]
<arithmetic operator>   ::= "+" | "-" | "*" | "/"
<setscore statement>    ::= "SETSCORE" <scorelist>
<scorelist>             ::= <score function> "(" <value list> ")"
                            {<score function> "(" <value list> ")"}
<score function>        ::= "BANK-IDENTIFIER" |
                            "BAYESIAN" |
                            "CSCORE-1" | ... | "CSCORE-9" |
                            "KEY" |
                            "LATENCY" |
                            "MAXIMUM-LIKELIHOOD" |
                            "MODAL-BAYESIAN" |
                            "NUMBER-ADMINISTERED" |
                            "NUMBER-CORRECT" |
                            "PROPORTION-CORRECT" |
                            "RESPONSE" |
                            "TIME"
<value list>            ::= {<value>}
<keep statement>        ::= "KEEP" <write list>
<write list>            ::= (<value> | <string>) {<write list>}
<string>                ::= """ <alphabetic character>
                            {<alphabetic character>} """
<autokeep statement>    ::= "AUTOKEEP" <write list>
<interpret statement>   ::= "INTERPRET" "(" <interpret clause>
                            {<interpret clause>} ")"
<interpret clause>      ::= <string> | "A" | <var> | <label>
<custom statement>      ::= "PROC-1" | ... | "PROC-5"


Test  Division 

The  entire  MCATL  test  specification  is  a  test  statement.  Subtests  within 
the  main  test  are  also  test  statements.  A  test  statement  consists  of  the  word 
TEST,  a  test  name,  a  group  of  test  specification  statements,  and  the  word 
ENDTEST.  All  scoring  functions  are  local  to  the  test  in  which  they  are  listed. 
Similarly,  variables  are  local  to  the  test  unless  they  are  explicitly  declared  as 
global.  Thus,  any  number  of  independent  subtests  can  be  included  within  the 
main  test.  Variable  names  and  scoring  functions  can  be  reused  within  each  test 
without  interference  from  other  tests. 

Subtests  can  be  nested  within  tests  and  other  subtests  to  an  arbitrary 
degree.  However,  this  nesting  is  not  hierarchical,  as  it  is  in  Pascal,  because 
nested subtests are independent of the tests in which they are nested, except for
the  global  variables  they  contain.  Whereas  a  nested  Pascal  procedure  has  access 
to  all  variables  declared  in  superordinate  procedures,  MCATL  subtests  do  not. 
Nesting  in  MCATL  is  available  only  to  allow  the  administration  of  independent 
subtests  before  the  completion  of  a  current  test. 

Administration  Control 

Eight  statements  make  up  the  administration  control  category.  The  item 
statement  specifies  that  an  item  should  be  administered  or  included  in  a 
structure.  An  item  statement  consists  of  a  #  sign,  an  item  number  and, 
optionally,  a  branch  clause  and  a  characteristic  clause.  The  branch  clause  to 
the item statement specifies where the item should branch on the specified
conditions.  The  characteristic  clause  can  override  certain  characteristics  of  the 
item,  such  as  whether  the  screen  should  be  cleared  before  the  item  is  presented. 
The  branch  clause  is  provided  to  allow  inter-item  branching  strategies.  The 
characteristic clause override was provided as a convenience, but it has been of
little practical use and was therefore not documented in the user’s manual. It
may  be  deleted  from  later  versions  of  the  system. 

The  TERMINATE  statement  specifies  the  conditions  under  which  testing 
should  stop.  It  consists  of  the  word  TERMINATE,  an  optional  label  to  branch 
to  upon  termination,  and  a  logical  expression  that  describes  the  condition  under 
which  termination  should  occur.  The  TERMINATE  expression  is  evaluated 
after  each  item  that  accepts  a  response  is  administered.  Typically,  only  the 
administration  of  an  item  will  change  the  variables  contained  in  a 
TERMINATE expression. The label was included as an afterthought.
Typically, a score is written to the output file after the termination criterion is
satisfied.  Without  the  label,  the  flow  of  control  in  the  test  specification 
transferred  to  the  ENDTEST.  Local  variables  disappeared,  and  a  score  was 
available  to  write  out  only  if  it  was  stored  in  a  global  variable.  The  label  field 
allowed  for  the  transfer  of  control  to  an  output  statement,  typically  near  the 
end  of  the  test. 

The IF statement provides a structure that is useful for non-reentrant
inter-subtest branching. It begins with the word IF and a logical expression
describing  the  conditions  under  which  the  first  subtest  will  be  administered. 
Additional  subtests  are  separated  from  the  first  subtest  by  ELSEIF  clauses, 
which  are  similar  to  the  original  IF  clause  except  that  the  word  ELSEIF  is 
substituted.  The  IF  statement  ends  with  the  word  ENDIF.  Although  the  initial 
IF  statement  requires  a  logical  expression,  the  logical  expression  is  optional  for 
the  ELSEIF  clauses.  If  the  logical  expression  is  omitted,  the  ELSEIF  clause  will 
be  executed  unconditionally.  Only  the  first  IF  or  ELSEIF  clause  whose  logical 
expression  is  satisfied  will  be  executed.  Thus  the  IF  and  all  ELSEIFs  except 
the  last  one  should  be  accompanied  by  a  logical  expression.  The  use  of  the  IF- 
ELSEIF-ENDIF  structure  is  not  limited  to  the  selection  of  tests,  however.  It  can 
also  be  used  to  separate  any  collection  of  statements.  However,  control  cannot 
be  transferred  into  or  out  of  the  IF-ELSEIF-ENDIF  structure. 
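
The clause-selection rule of the IF-ELSEIF-ENDIF structure can be illustrated with a brief Python sketch. It is not taken from the MicroCAT system; the function name choose_clause, the representation of clauses as (condition, body) pairs, and the variable name SCORE in the usage note are all assumptions made for illustration.

```python
def choose_clause(clauses, variables):
    """Return the body of the first clause whose condition holds, else None.

    clauses is a list of (condition, body) pairs in source order. A condition
    of None stands for an ELSEIF written without a logical expression, which
    is executed unconditionally once it is reached.
    """
    for condition, body in clauses:
        if condition is None or condition(variables):
            return body
    return None
```

For a non-reentrant inter-subtest design, the clauses might pair conditions on a routing score, such as SCORE < 3, with the subtests to administer. Because only the first satisfied clause is chosen, an unconditional clause placed before the last position would prevent any later clause from being reached.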

The  JUMP  statement  provides  a  convenient  unstructured  alternative  to  the 
IF  structure  for  progressing  through  a  test.  It  is  intended  for  applications  for 
which  the  structured  statements  are  too  cumbersome.  When  it  is  used  without  a 
logical  expression,  it  provides  an  unconditional  transfer  of  control.  With  a 
logical  expression,  it  provides  a  conditional  transfer  of  control.  It  cannot  be 
used  to  branch  outside  of  the  test  block  in  which  it  appears  nor  can  it  be  used 
to  branch  into  the  middle  of  a  statement  (e.g.,  an  IF  statement). 

The  SEARCH  statement  initiates  a  model-based  testing  strategy.  Given  a 
value  to  search  on,  it  chooses  the  unadministered  item  that  has  the  most 
psychometric  information  at  that  value.  If  the  value  provided  is  a  score,  the 
search  implements  an  adaptive  test.  If  the  value  provided  is  a  constant,  it 
administers  a  conventional  test.  The  searched  test  will  continue  until  all  items 
have  been  administered  or  until  the  termination  criterion  specified  in  the 
TERMINATE  statement  has  been  satisfied. 
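
The selection step performed by SEARCH can be sketched as follows. The sketch rests on assumptions not stated in this report: it uses a two-parameter logistic item response model with hypothetical discrimination (a) and difficulty (b) parameters, and it computes information directly rather than through the pre-computed search tables that the MicroCAT compiler builds. The function names are illustrative.

```python
import math


def information(theta, a, b):
    """Fisher information of a two-parameter logistic item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def search(pool, administered, theta):
    """Return the id of the unadministered item with the most information.

    pool maps item id -> (a, b). Returns None when every item has been given.
    """
    candidates = [item for item in pool if item not in administered]
    if not candidates:
        return None
    return max(candidates, key=lambda item: information(theta, *pool[item]))
```

Passing a current ability estimate as theta corresponds to the adaptive case described above; passing a fixed constant corresponds to the conventional case, in which every examinee receives the items in the same order.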

The  SEQUENCE  statement  is  used  for  reentrant  inter-subtest  branching.  A 
sequence  contains  a  set  of  items.  Each  time  the  sequence  is  executed,  the  next 
item in the sequence is administered. Branching occurs from the SEQUENCE
statement  after  the  item  is  administered.  An  ONEND  branch  is  provided  for 
the  condition  in  which  the  items  in  the  sequence  are  exhausted.  If  testing 
exhausts  a  sequence  that  has  no  ONEND  branch,  control  falls  through  the 
ENDSEQUENCE  to  the  next  statement  in  the  specification. 
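
The reentrant behavior of SEQUENCE can be modeled with a small Python sketch. It is a simplified illustration rather than the MicroCAT implementation. The class and function names are hypothetical, every sequence is assumed to have an ONEND label, the ONEND branch is assumed to be taken when an exhausted sequence is entered again, and max_items stands in for a termination rule.

```python
class Sequence:
    """A reentrant sequence: each visit administers the next unused item."""

    def __init__(self, items, on_correct, on_incorrect, on_end):
        self.items = list(items)
        self.on_correct = on_correct
        self.on_incorrect = on_incorrect
        self.on_end = on_end
        self.position = 0  # index of the next item to administer

    def step(self, is_correct):
        """Administer one item and return (item, label to branch to)."""
        if self.position >= len(self.items):
            return None, self.on_end  # sequence exhausted
        item = self.items[self.position]
        self.position += 1
        target = self.on_correct if is_correct(item) else self.on_incorrect
        return item, target


def run(sequences, start, is_correct, max_items):
    """Follow branches between sequences until one runs out of items."""
    given, label = [], start
    while len(given) < max_items:
        item, label = sequences[label].step(is_correct)
        if item is None:
            break
        given.append(item)
    return given, label
```

Because each sequence keeps its own position, the test author does not have to set and reset pointer variables by hand, which was the difficulty noted for TCL.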

The  REVIEW  statement  allows  examinees  to  review  items  in  one  of  three 
ways.  The  examinee  can  review  all  of  the  items,  those  items  he  or  she  skipped, 
or  specific  items  identified  by  a  sequence  number.  If  the  review  option  is  in 
effect,  the  review  process  will  automatically  be  explained  at  the  end  of  the  test 
and  the  examinee  will  be  given  the  opportunity  to  use  it.  When  a  response  is 
changed,  all  scores  are  recalculated.  If  a  score  is  sequence-dependent,  the  new 
response  will  be  substituted  for  the  old  one.  Thus,  the  score  will  be  the  same  as 
if  the  response  given  in  the  review  had  been  given  initially. 

The  final  administration  control  statement  is  the  DOREVIEW  statement. 
The  review  process  is  normally  initiated  when  the  ENDTEST  statement  is 
reached.  Often,  however,  it  must  be  initiated  before  the  ENDTEST  (for 
example,  to  write  post-review  scores  to  the  data  file).  Like  the  ENDTEST,  the 
DOREVIEW  statement  initiates  the  review  process,  but  since  the  test  does  not 
end,  all  local  variables  are  still  available  after  the  review. 

Scoring 

The  SET  statement  is  a  simple  assignment  statement  with  which  the  value 
of  an  expression  can  be  assigned  to  a  variable.  It  can  be  used  to  initialize  score 
variables  and  to  transform  scores. 

The  SETSCORE  statement  selects  the  scoring  functions  that  are  to  be  used 
in  a  test  or  subtest  and  assigns  constants  or  variable  names  to  the  parameters  of 
the  scoring  functions.  A  scoring  function  can  be  listed  more  than  once  in  a 
SETSCORE  statement,  but  variable  names  cannot.  The  SETSCORE  statement 
only  sets  scores  up  to  be  calculated.  No  scores  are  calculated  until  the  variables 
assigned  as  parameters  are  used. 

Reporting 

Three  reporting  statements  are  provided:  the  KEEP  statement,  the 
AUTOKEEP  statement,  and  the  INTERPRET  statement.  The  KEEP  statement 
is  a  simple  file-write  statement  that  causes  variables  and  text  to  be  written  to  a 
permanent  file,  XTESTEEX.KEE.  The  KEEP  statement  writes  a  line  to  the  file 
every  time  it  is  executed  in  the  test  specification. 

The  AUTOKEEP  statement  is  similar  to  the  KEEP  statement  but  it  is 
executed  every  time  an  item  is  administered  and  a  response  is  accepted.  The 
AUTOKEEP  statement  is  particularly  useful  when  intermediate  results,  such  as 
item  responses,  are  to  be  kept  in  a  model-based  branching  strategy  using  the 
SEARCH  statement.  It  also  provides  a  convenient  way  of  keeping  response 
data  even  if  the  SEARCH  statement  is  not  used. 


The  INTERPRET  statement  provides  a  means  of  writing  a  narrative 
interpretation  that  can  be  given  to  the  examinee  to  explain  the  test  results.  It 
is  similar  in  function  to  the  KEEP  statement  but  provides  formatting 
capabilities  for  extended  text  production.  It  also  writes  to  a  separate  file, 
XTESTEEX.INT.  In  operational  use,  the  KEEP  and  AUTOKEEP  statements 
are  usually  used  to  keep  data,  and  the  INTERPRET  statement  is  used  to 
provide  a  report  for  the  examinee. 

Customizing 

The  five  custom  statements  branch  to  one  of  five  possible  custom 
processing  procedures  written  by  the  test  developer  in  a  programming  language 
such  as  Pascal  or  FORTRAN.  When  executed,  these  statements  transfer  control 
to  the  user’s  procedure,  which  is  responsible  for  all  processing  and  for  the 
return  of  control  to  the  testing  system  when  it  is  done. 

Example 

Figure  2  shows  an  example  of  a  test  specification.  The  TEST  statement 
names  the  test  SAMPLE.  The  SETSCORE  statement  indicates  that  two  scores 
are  to  be  computed.  The  first  one,  the  Bayesian  score,  uses  the  variables  MEAN 
and  VAR  for  the  posterior  means  and  variances.  The  prior  mean  and  variance 
are  set  at  zero  and  one,  respectively.  The  variable  NUMADMIN  is  assigned  to 
the number-administered scoring function and will contain a count of the
number  of  items  that  have  been  administered. 

The  TERMINATE  statement  establishes  the  conditions  for  terminating  the 
test  and  identifies  the  statement  to  branch  to  when  the  conditions  are  satisfied. 
This  test  will  terminate  when  the  Bayesian  posterior  variance  becomes  less  than 
0.1  and  more  than  five  items  have  been  administered.  Upon  termination, 
control will transfer to the statement labeled $TERMOK.
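
The interplay between the Bayesian score and the termination rule can be illustrated numerically. This report does not describe how the BAYESIAN scoring function computes its posterior, so the sketch below substitutes a simple stand-in: a two-parameter logistic model with hypothetical item parameters and brute-force integration over a grid of ability values. Only the prior (mean 0, variance 1) and the termination rule are taken from the sample test.

```python
import math


def posterior(responses, prior_mean=0.0, prior_var=1.0, points=161):
    """Posterior mean and variance of ability on a grid from -4 to 4.

    responses is a list of (a, b, correct) tuples for the items given so far.
    """
    grid = [-4.0 + 8.0 * k / (points - 1) for k in range(points)]
    weights = []
    for theta in grid:
        # Unnormalized normal prior density at this grid point.
        w = math.exp(-0.5 * (theta - prior_mean) ** 2 / prior_var)
        for a, b, correct in responses:
            p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
            w *= p if correct else 1.0 - p  # likelihood of the response
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, var


def should_terminate(var, num_admin):
    """The rule from the sample test: (VAR < 0.1) AND (NUMADMIN > 5)."""
    return var < 0.1 and num_admin > 5
```

In this sketch each additional response narrows the posterior, so the variance condition is eventually met if the item pool is large enough; the count condition guarantees a minimum test length regardless of how quickly the variance shrinks.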

The  REVIEW  option  is  activated  by  the  REVIEW  statement.  In  this 
example,  the  examinee  will  be  allowed  to  review  items  that  he  or  she  skipped 
on  the  first  pass. 

A flexilevel testing strategy is implemented in Test SAMPLE. In this
example, items are ordered by difficulty. Item 6 is the easiest and Item 15 is
the hardest. The strategy is implemented using two SEQUENCE statements
labeled SEQ1 and SEQ2. In sequence one, items start at a medium difficulty
and become easier further into the sequence. In sequence two, the items again
start at a medium difficulty, but become more difficult. Testing begins in
sequence one. Following a correct response to an item in sequence one, testing
branches to sequence two. After an incorrect response to an item in sequence
one, the next item in sequence one is chosen. When sequence one runs out of
items, the testing process branches to the label $HITEND. The branching
instructions for sequence two are the same as for sequence one. As a result of
these branching instructions, examinees who answer a majority of the items
correctly will be given more items from sequence two, while examinees who
answer a majority incorrectly will be given more items from sequence one.


TEST SAMPLE

SETSCORE BAYESIAN(MEAN, VAR, 0, 1),&
NUMBER-ADMINISTERED(NUMADMIN)

TERMINATE $TERMOK ((VAR < 0.1) AND (NUMADMIN > 5))

REVIEW SKIPPED

$SEQ1 SEQUENCE CORRECT: $SEQ2, INCORRECT: $SEQ1,&
ONEND: $HITEND
#ITEM10
#ITEM9
#ITEM8
#ITEM7
#ITEM6
ENDSEQUENCE

$SEQ2 SEQUENCE CORRECT: $SEQ2, INCORRECT: $SEQ1,&
ONEND: $HITEND
#ITEM11
#ITEM12
#ITEM13
#ITEM14
#ITEM15
ENDSEQUENCE

$HITEND KEEP ("TEST TERMINATED BECAUSE IT RAN OUT OF
ITEMS AFTER ", NUMADMIN, " HAD BEEN GIVEN.")

$TERMOK KEEP ("THE SCORE BEFORE REVIEW WAS ", MEAN)

DOREVIEW

KEEP ("THE SCORE AFTER REVIEW WAS ", MEAN)

ENDTEST


The KEEP statement labeled $HITEND inserts a message into the
permanent data file indicating that the test terminated because it ran out of
items. The KEEP statement labeled $TERMOK inserts a line in the permanent
data file indicating the Bayesian posterior mean before any items are reviewed.
The DOREVIEW statement initiates the review process. When this
statement  is  executed,  the  examinee  is  given  the  option  of  reviewing  the  items 
that  he  or  she  skipped  during  the  first  pass. 

The  KEEP  statement  following  the  DOREVIEW  statement  writes  the  final 
score  to  the  permanent  file.  It  says,  "The  score  after  review  was",  and  prints 
the  value  of  the  Bayesian  posterior  mean.  This  score  is  calculated  using  the 
new  responses  scored  in  the  order  of  the  original  item  presentation. 

The  ENDTEST  ends  the  sample  test.  If  the  DOREVIEW  statement  had  not 
been  included  within  the  test,  the  ENDTEST  statement  would  execute  the 
review. 

LANGUAGE  IMPLEMENTATION  AND  EVALUATION 

A  compiler  for  MCATL  was  implemented  on  an  IBM  PC  as  part  of  the 
MicroCAT  Testing  System.  It  compiles  the  MCATL  specification  to  an 
intermediate  code  from  which  the  test  can  be  administered,  and  at  the  same 
time  performs  as  many  of  the  computations  as  possible  that  will  be  needed  for 
administration.  Test  items  are  randomly  accessible  in  the  test  file.  In  addition, 
graphic  test  items,  which  are  stored  in  the  item  banks  as  strings  of  graphic 
commands,  are  translated  into  pixel  representation  and  compressed  for  rapid 
presentation  during  testing.  Furthermore,  search  tables  are  set  up  so  that 
information-based  item  selection  can  proceed  rapidly  with  only  a  table  look-up 
and  no  computations  during  the  testing  session. 
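
The idea of moving item-selection computations into the compilation phase can be sketched as follows. This Python example is an illustration only and does not describe the actual layout of the MicroCAT search tables. It assumes a three-parameter logistic item model with the conventional scaling constant 1.7 and an evenly spaced grid of ability values; the function names are hypothetical.

```python
import math

D = 1.7  # conventional logistic scaling constant (assumed)


def information(theta, a, b, c):
    """Fisher information of a three-parameter logistic item at ability theta."""
    p = c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def build_search_table(items, grid):
    """Compile-time step: for each grid ability, rank item names by information.

    items: dict mapping item name -> (a, b, c) parameters.
    grid:  evenly spaced list of ability values.
    """
    return [
        sorted(items, key=lambda name: information(theta, *items[name]), reverse=True)
        for theta in grid
    ]


def select_item(table, grid, theta, used):
    """Test-time step: return the best unused item for the nearest grid point."""
    step = grid[1] - grid[0]
    row = round((theta - grid[0]) / step)
    row = max(0, min(len(grid) - 1, row))  # clamp abilities outside the grid
    for name in table[row]:
        if name not in used:
            return name
    return None  # every item has already been administered
```

All evaluations of the information function occur in build_search_table. During testing, select_item only locates a row and scans it for the first item not yet used.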

In  the  MicroCAT  implementation,  an  MCATL  pre-processor  was  developed 
to  provide  a  simpler,  more  user-friendly  interface  for  the  test  developer  who  is 
relatively  unfamiliar  with  programming  languages.  This  pre-processor  creates 
MCATL  test  specifications  from  a  test  template  and  user-supplied  information. 
An  MCATL  template  typically  consists  of  an  MCATL  test  specification  with 
several  blanks  for  the  user  to  fill  in.  These  blanks  may  be  filled  with  item 
reference  numbers,  termination  criteria,  etc.  In  addition  to  the  MCATL 
statements  and  the  blanks,  INSTRUCT  statements  are  usually  included  in  the 
template.  The  INSTRUCT  statements  are  displayed  by  the  pre-processor  as  they 
are encountered and are used to inform the user what information the
pre-processor is requesting. The output of the MCATL pre-processor is an MCATL
test  specification.  Like  any  MCATL  test  specification,  this  has  to  be  compiled 
before  the  test  can  be  administered. 
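
A minimal sketch of such a pre-processor is given below in Python. The template format shown here is an assumption made for illustration: blanks are marked by a run of underscores, and a line beginning with INSTRUCT is shown to the user and omitted from the output. The actual MicroCAT template syntax may differ.

```python
import re

BLANK = re.compile(r"_{2,}")  # hypothetical marker for a blank in a template


def fill_template(template, answers, show=print):
    """Produce a test specification from a template and user-supplied answers.

    template: template text; INSTRUCT lines are prompts, underscores are blanks.
    answers:  values for the blanks, in the order the blanks appear.
    show:     function used to display each INSTRUCT message to the user.
    """
    pending = list(answers)

    def next_answer(_match):
        if not pending:
            raise ValueError("template has more blanks than answers")
        return pending.pop(0)

    output = []
    for line in template.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("INSTRUCT"):
            # Display the prompt; it is not copied into the specification.
            show(stripped[len("INSTRUCT"):].strip())
            continue
        output.append(BLANK.sub(next_answer, line))
    return "\n".join(output)
```

The result is ordinary specification text, which would still have to be passed to the compiler before the test could be administered.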

The  MCATL  compiler  of  the  MicroCAT  system  has  been  tested  internally 
for  nearly  two  years,  and  it  has  been  field  tested  for  more  than  one  year. 
Although  the  language  and  the  compiler  have  not  been  formally  evaluated,  first 
users  have  indicated  that  they  are  pleased  with  its  power  and  simplicity  in 
setting  up  adaptive  tests.  Three  levels  of  involvement  were  provided  with  the 
system:  the  template  level,  the  MCATL  level,  and  the  Pascal  level.  All  three 
levels  are  currently  being  used.  The  template  level  provides  very  fast 
development  of  tests  using  standard  strategies.  The  MCATL  level  allows 
specification  of  the  more  common  adaptive  testing  strategies.  The  Pascal  level 
has provided a necessary extension for some new adaptive strategies that were
not envisioned at the time the language was designed and are not yet
commonplace enough to warrant their inclusion in the language.

Perhaps  the  most  noticeable  shortcoming  of  MCATL  is  its  occasional 
tendency  to  provide  features  that  are  not  useful.  One  example  of  this  may  be 
the  test  nesting  capability.  MCATL  was  designed  to  allow  several  levels  of 
nesting  of  subtests  within  tests.  However,  this  is  rarely  done;  in  fact,  tests  are 
usually  not  nested  at  all.  The  first  test  may  consist  of  instructional  items,  the 
second  test  may  consist  of  measurement  items,  and  a  third  test  may  consist  of 
feedback  items.  These  tests  are  independent,  however,  and  do  not  need  to  be 
nested.  Similarly,  the  capability  to  override  item  characteristics  in  the  item 
statement  was  included  in  the  original  specification.  In  fact,  this  will  probably 
never  be  done.  This  feature  was  not  documented  in  the  MicroCAT  manual  and 
may  be  removed  from  future  versions  of  the  language.  The  advantage  of 
eliminating  these  extraneous  capabilities  will  be  a  reduction  in  size  and  an 
enhancement  in  speed  of  the  testing  system. 

In  general,  MCATL  has  functioned  well.  The  separate  compilation  phase 
has  proved  very  advantageous  in  terms  of  execution  speed.  Whereas  it  may 
take  as  much  as  one  minute  per  item  to  compile  items  with  very  complex 
graphics,  the  time  required  to  display  the  item  during  execution  is  uniformly 
about  half  a  second.  Similarly,  although  it  may  take  several  minutes  to 
construct  a  search  table  for  a  mathematically  branched  testing  strategy,  the 
time  it  takes  to  retrieve  an  item  during  testing  is  uniformly  less  than  two 
seconds,  regardless  of  the  testing  strategy  used. 